Discovery of Multiple-Level Rules from Large
Abstract
With the widespread computerization in business, government, and science, the efficient and effective discovery of interesting information from large databases becomes essential. Data mining, or Knowledge Discovery in Databases (KDD), emerges as a solution to the data analysis problems faced by many organizations. Previous studies on data mining have focused on the discovery of knowledge at a single conceptual level, either at the primitive level or at a rather high conceptual level. However, it is often desirable to discover knowledge at multiple conceptual levels, which will provide a spectrum of understanding, from general to specific, for the underlying data. In this thesis, we first introduce the conceptual hierarchy, a hierarchical organization of the data in the databases. Two algorithms for dynamic adjustment of conceptual hierarchies are developed, as well as another algorithm for automatic generation of conceptual hierarchies for numerical attributes. In addition, a set of algorithms is developed for mining multiple-level characteristic, discriminant, and association rules. All algorithms developed were implemented and tested in our data mining prototype system, DBMiner. The attribute-oriented induction method is extended to discover multiple-level characteristic and discriminant rules. A progressive deepening method is proposed for mining multiple-level association rules. Several variants of the method with different optimization techniques are implemented and tested. The results show the method is efficient and effective. Furthermore, a new approach to association rule mining, meta-rule guided mining, is proposed. The experiments show that meta-rule guided mining is powerful and efficient. Finally, an application of data mining techniques, cooperative query answering using multiple layered databases, is presented. Our study concludes that mining knowledge at multiple levels is both practical and desirable, and thus is an interesting research direction.
Some future research problems are also discussed.
Dedication
To my parents and my wife.
Acknowledgements
I would like to thank Prof. Jiawei Han, my senior supervisor, for his continuous help, encouragement, and support during my study. Prof. Han always found time in his busy schedule for frequent discussions with me, and his creative thinking and insight made our discussions fruitful and interesting. My endeavors would not have been successful without him. I would also like to thank Prof. Veronica Dahl for serving on my supervisory committee. Prof. Dahl gave me good advice on my research. My deepest thanks go to Prof. Tiko Kameda and Prof. Len Shapiro for serving as examiners of this thesis.
I wish to express my gratitude to many people in the School of Computing Science. Prof. Lou Hafer, Prof. Arvind Gupta, and Prof. Qiang Yang provided various kinds of help when they were most needed. Mrs. Kersti Jaager and other secretaries were always available for help. Micheline Kamber read this thesis and gave many useful comments. The discussions with Prof. David Cheung of Hong Kong University helped me understand more about the mining of association rules.
My thanks also go to many fellow graduate students who made my days at SFU enjoyable: Krzysztof (Kris) Koperski, Wei Wang, Osmar Zaiane, Andrew Fall, Jiashua Liu, Tong Lu, Jie Wei, Yao Liang, Martin Vorbeck, Jenny Chiang, Hui Li, Wan Gong, Yijun Lu, Nebojsa Stefanovic, Betty Xia, Hongshen Chin, and Ye Lu.
I am very grateful to my wife, Lei Jiang, who always energizes me with her love, understanding, and support. I am also indebted to my parents for their everlasting understanding and encouragement.
Related resources
Discovery of Multiple-Level Association Rules from Large Databases
Discovery of association rules from large databases has recently been a focus of research in database mining. Previous studies discover association rules at a single concept level; however, mining association rules at multiple concept levels may lead to finding more informative and refined knowledge from data. In this paper, we study efficient methods for mining multiple-level associat...
Mining Multiple-Level Association Rules in Large Databases
A top-down progressive deepening method is developed for efficient mining of multiple-level association rules from large transaction databases based on the Apriori principle. A group of variant algorithms is proposed based on the ways of sharing intermediate results, with the relative performance tested and analyzed. The enforcement of different interestingness measurements to find more intere...
FP-tree and COFI Based Approach for Mining of Multiple Level Association Rules in Large Databases
In recent years, discovery of association rules among itemsets in a large database has been described as an important database-mining problem. The problem of discovering association rules has received considerable research attention, and several algorithms for mining frequent itemsets have been developed. Many algorithms have been proposed to discover rules at a single concept level. However, mini...
The Introduction of a Heuristic Mutation Operator to Strengthen the Discovery Component of XCS
The extended classifier system (XCS) tries to solve learning problems online by producing a set of rules (classifiers). XCS is a rather complex combination of a genetic algorithm and reinforcement learning: it uses the genetic algorithm to discover promising rules and values them by reinforcement learning. Among the important factors in the performance of XCS is the possibility to...
Publication date: 1996